Towards a Big Data Reference Architecture
Abstract
Technologies and promises connected to ‘big data’ have received a lot of attention lately. Leveraging emerging ‘big data’ sources extends the requirements of traditional data management because of the large volume, velocity, variety and veracity of this data. At the same time, it promises to extract value from previously largely unused sources and to turn insights from this data into a competitive advantage. To gain this value, organizations need to consider new architectures for their data management systems and new technologies to implement these architectures. In this master’s thesis I identify the additional requirements that result from these new characteristics of data, design a reference architecture that combines several data management components to address these requirements, and finally discuss current technologies that can be used to implement the reference architecture. The design of the reference architecture takes an evolutionary approach: it builds on the traditional enterprise data warehouse architecture and integrates additional components aimed at handling the new requirements. Implementing these components involves technologies such as the Apache Hadoop ecosystem and so-called ‘NoSQL’ databases. Finally, a verification of the reference architecture shows that it is correct and relevant to practice. The proposed reference architecture and a survey of the current state of the art in ‘big data’ technologies guide designers in creating systems that derive new value from existing but previously under-used data. Such systems provide decision makers with entirely new insights on which to base decisions. These insights can enhance companies’ productivity and competitiveness, support innovation and even create entirely new business models.
Related resources
The GOBIA Method: Towards Goal-Oriented Business Intelligence Architectures
Traditional Data Warehouse (DWH) architectures are challenged by numerous novel Big Data products. These tools are typically presented as alternatives or extensions for one or more of the layers of a typical DWH reference architecture. Still, there is no established joint reference architecture for both DWH and Big Data that is inherently aligned with business goals as implied by Business Intel...
The GOBIA Method: Fusing Data Warehouses and Big Data in a Goal-Oriented BI Architecture
Traditional Data Warehouse (DWH) architectures are challenged by numerous novel Big Data products. These tools are typically presented as alternatives or extensions for one or more of the layers of a typical DWH reference architecture. Still, there is no established joint reference architecture for both DWH and Big Data that is inherently aligned with business goals as implied by Business Intel...
Towards a Security Reference Architecture for Big Data
Companies are aware of Big Data importance as data are essential to conduct their daily activities, but new problems arise with new technologies, as it is the case of Big Data; these problems are related not only to the 3Vs of Big Data, but also to privacy and security. Security is crucial in Big Data systems, but unfortunately, security problems occur due to the fact that Big Data was not init...
A Reference Architecture for Big Data Solutions
With big data technology and predictive analytics techniques, organizations can now register, combine, process and analyze data to answer questions that were unsolvable a few years ago. This paper introduces a solution reference that gives guidance to organizations that want to innovate using big data technology and predictive analytics techniques for improving their performance. The reference ...
An Architecture for Security and Protection of Big Data
The issue of online privacy and security is a challenging subject, as it concerns the privacy of data that are increasingly more accessible via the internet. In other words, people who intend to access the private information of other users can do so more efficiently over the internet. This study is an attempt to address the privacy issue of distributed big data in the context of cloud computin...